Entanglement verification with finite data

Suppose an experimentalist wishes to verify that his apparatus produces entangled quantum states. A finite amount of data cannot conclusively demonstrate entanglement, so drawing conclusions from real-world data requires statistical reasoning. We propose a reliable method to quantify the weight of evidence for (or against) entanglement, based on a likelihood ratio test. Our method is universal in that it can be applied to any sort of measurement. We demonstrate the method by applying it to two simulated experiments on two qubits. The first measures a single entanglement witness, while the second performs a tomographically complete measurement.



Entanglement is an essential resource for quantum information processing, and producing and verifying entangled states is considered a benchmark for quantum experiments (for a sample from the most recent experiments on a wide variety of physical systems, see [1]). Several methods for verifying entanglement have been developed (for overviews, see [2, 3]). A bipartite state is entangled if it is not separable, and data $\mathcal{D}$ demonstrate entanglement if there is no separable state that could have generated them. As the number of data $N \to \infty$, the data are unambiguous, but for finite $N$ only probabilistic conclusions can be drawn. In this Letter, we quantify exactly what can be concluded from finite or small data sets, using a simple and efficient likelihood ratio test.

We demonstrate the method using two simulated experiments on two-qubit systems [12]. The first measures just one observable, an entanglement witness [4]. The other performs a tomographically complete measurement. In both cases, we use likelihood ratios to draw direct conclusions about entanglement, rather than estimating the quantum state as an intermediate step. A related technique for testing violation of local realism, based on empirical relative entropy instead of the likelihood ratio, was proposed by van Dam et al. [5] and applied by Zhang et al. [6].

Likelihood Ratios: Data $\mathcal{D}$ could have been generated by any one of many i.i.d. states $\rho^{\otimes N}$. Each state $\rho$ represents a theory about the system, and the relative plausibility of different states is measured by their likelihood $\mathcal{L}(\rho)$. A state's likelihood is simply the probability of the observed data given that state,

$$\mathcal{L}(\rho) = \Pr(\mathcal{D}|\rho), \qquad (1)$$

and states with higher likelihood are more plausible. If the most likely state is separable, the data clearly do not support entanglement. If it is entangled, then we need to ask how convincing the data are: specifically, whether some separable state is almost as plausible. To judge whether there is (even just one) separable state that fits the data, we compare the likelihoods of (i) the most likely separable state, and (ii) the most likely of all states. Letting $\mathcal{S}$ be the set of separable states, we define

$$\Lambda \equiv \frac{\max_{\rho \in \mathcal{S}} \mathcal{L}(\rho)}{\max_{\text{all } \rho} \mathcal{L}(\rho)}. \qquad (2)$$

$\Lambda$ is a likelihood ratio, and

$$\lambda = -2 \log \Lambda \qquad (3)$$

represents the weight of evidence in favor of entanglement [13]. To demonstrate entanglement convincingly, an experiment must yield a sufficiently large value for $\lambda$.

FIG. 1: General schema of a likelihood ratio test. The separable states $\mathcal{S}$ (cyan) are a convex subset of all states, surrounded by entangled states (red). Data from an experiment on a state $\rho$ yield a quasiconcave likelihood function [(a)] with a unique maximum ($\hat\rho_{\rm MLE}$). $\hat\rho_{\rm MLE}$ is randomly distributed around $\rho$, at a typical length scale $\delta = O(1/\sqrt{N})$. If $\hat\rho_{\rm MLE}$ is separable then there is no evidence for entanglement, but if it is entangled (as shown), then the relative likelihoods of $\hat\rho_{\rm MLE}$ and the most likely separable state determine the weight of evidence. Data are "convincing" if they are very unlikely to have been produced by a borderline separable state. Typical likelihood ratios for such states depend on the shape of $\mathcal{S}$. In (b)-(d) we show three possible cases: in (b) $\mathcal{S}$ is smaller than $\delta$ and behaves like a point; in (c) it is of size $\delta$ and its behavior is hard to characterize; in (d) it is much bigger than $\delta$ and behaves like a half-space.

A likelihood ratio does not assign a probability to "$\rho$ is entangled". Instead, it yields a confidence level. We can determine what values of $\lambda$ typically result from measurements on $\rho^{\otimes N}$, and how their distribution depends on whether $\rho$ is entangled or separable. If we measure $\lambda = \lambda_{\rm exp}$, and no separable state produces $\lambda > \lambda_{\rm exp}$ with probability higher than $\epsilon$, then we have demonstrated entanglement at the $1 - \epsilon$ confidence level. If an experimentalist plans (before taking data) to calculate $\lambda$ and report "$\rho$ is entangled" only when the data imply $1 - \epsilon$ confidence, then the probability that he erroneously reports entanglement [14] is at most $\epsilon$.
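The decision rule described above can be sketched in a few lines. This is only an illustration, assuming that samples of $\lambda$ have already been simulated for each candidate separable state; the function names and the choice to count ties (which makes the test slightly more conservative) are assumptions, not part of the original analysis.

```python
def exceedance_probability(lam_exp, null_lambdas):
    """Fraction of simulated lambda values that are at least as large as lam_exp.

    Ties are counted, which is the conservative choice (an assumption here).
    """
    hits = sum(1 for lam in null_lambdas if lam >= lam_exp)
    return hits / len(null_lambdas)


def report_entangled(lam_exp, null_samples_by_state, epsilon=0.05):
    """Report entanglement only if no candidate separable state produces
    a value as large as lam_exp with estimated probability above epsilon."""
    worst_case = max(
        exceedance_probability(lam_exp, samples)
        for samples in null_samples_by_state
    )
    return worst_case <= epsilon
```

The worst case over candidate states is what matters, which is why the boundary states discussed next dominate the analysis.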

So, $\rho$ may be (i) entangled, (ii) separable, or (iii) on the boundary. Boundary states are still separable, and they are the hardest separable states to rule out. To demonstrate entanglement at the $1 - \epsilon$ confidence level, we must show that there is no boundary state for which $\Pr(\lambda > \lambda_{\rm exp}) \geq \epsilon$. It is difficult to make rigorous probabilistic statements about $\lambda$ for small $N$. But as $N \to \infty$, the following analysis becomes exact, and is generally thought to be reliable for $N > 30$ [7].

The distribution of $\lambda$: The set of quantum states $\rho$ is a convex subset of the vector space of trace-1 $d \times d$ Hermitian operators, $\mathbb{R}^{d^2-1}$. An entanglement-verification measurement is represented by a POVM (positive operator-valued measure) $\mathcal{M} = \{E_1 \ldots E_m\}$, in which each operator $E_k$ represents an event that occurs with probability $p_k = \mathrm{Tr}\,E_k\rho$ (Born's rule), and each $\rho$ defines a probability distribution $\vec{p} = \{p_1 \ldots p_m\}$. Data in which $E_k$ appeared $n_k$ times define empirical frequencies $\vec{f} = \{f_1 \ldots f_m\}$, where $f_k = n_k/N$. Both $\vec{p}$ and $\vec{f}$ can be represented as elements of an $m$-simplex embedded in a vector space $\mathbb{R}^{m-1}$. The probabilities in $\vec{p}$ may be linearly dependent (e.g., if $E_j + E_k = \mathbb{1}$, then $p_j + p_k = 1$ for all $\rho$), and at most $d^2 - 1$ of them can be independent (because $\rho$ contains only $d^2 - 1$ parameters). We define $\dim(\mathcal{M})$ as the number of independent probabilities.
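As a small illustration of Born's rule and of linearly dependent probabilities, the sketch below evaluates $p_k = \mathrm{Tr}\,E_k\rho$ for a hypothetical single-qubit POVM with two elements that sum to the identity, so that $\dim(\mathcal{M}) = 1$. The example state, the counts, and the helper names are assumptions chosen for illustration only.

```python
def trace_of_product(a, b):
    """Tr(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def born_probabilities(povm, rho):
    """p_k = Tr(E_k rho) for each POVM element E_k."""
    return [complex(trace_of_product(e, rho)).real for e in povm]


def frequencies(counts):
    """Empirical frequencies f_k = n_k / N."""
    n_total = sum(counts)
    return [n / n_total for n in counts]


# Hypothetical example: projective measurement on one qubit.
E0 = [[1, 0], [0, 0]]
E1 = [[0, 0], [0, 1]]
rho = [[0.75, 0.25], [0.25, 0.25]]   # a valid (unit-trace, positive) state

p = born_probabilities([E0, E1], rho)
f = frequencies([70, 30])            # made-up counts from N = 100 trials

# Because E0 + E1 is the identity, p[0] + p[1] = 1 for every state,
# so only one probability is independent.
```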

So Born's rule defines a linear mapping from the operator space containing quantum states into the probability space for measurement $\mathcal{M}$. If $\dim(\mathcal{M}) < d^2 - 1$, then the mapping from states to $\vec{p}$-vectors is many-to-one, and the experiment is completely insensitive to some parameters of $\rho$. Ignoring these irrelevant parameters makes $\rho$ an (effectively) $\dim(\mathcal{M})$-dimensional parameter. Separable states form a convex subset of all states (see Fig. 1). These sets' images in probability space are also nested convex sets (although if $\dim(\mathcal{M}) < d^2 - 1$, then some entangled states will be indistinguishable from separable ones in this experiment).

Suppose that $N$ copies of a state $\rho_0$ are measured, yielding a likelihood function $\mathcal{L}(\rho)$. $\mathcal{L}(\rho)$ has a unique global maximum $\hat\rho_{\rm MLE}$. As $N \to \infty$, the distribution of $\hat\rho_{\rm MLE}$ approaches a Gaussian around $\rho_0$ with covariance tensor $\Delta$. $\mathcal{L}(\rho)$ itself is a Gaussian function with the same covariance matrix $\Delta$ (see note [15]). This defines a characteristic length scale $\delta = |\Delta|^{1/2}$ that scales as $\delta = O(1/\sqrt{N})$. We can use $\Delta$ to define a stretched Euclidean metric

$$d(\rho_1, \rho_2) = \sqrt{\mathrm{Tr}\left[(\rho_1 - \rho_2)\,\Delta^{-1}(\rho_1 - \rho_2)\right]}. \qquad (4)$$

Using this metric, $\hat\rho_{\rm MLE}$ is univariate Gaussian distributed around $\rho_0$, and

$$\log \mathcal{L}(\rho) = -\frac{d(\rho, \hat\rho_{\rm MLE})^2}{2} \qquad (5)$$

(up to an additive constant that cancels in likelihood ratios). Thus, $\lambda$ is determined entirely by $d(\hat\rho_{\rm MLE}, \mathcal{S})$, the distance from $\hat\rho_{\rm MLE}$ to the separable set $\mathcal{S}$. If $\rho_0$ is demonstrably entangled, then $\lambda$ will grow proportional to $N$, but if it is indistinguishable from a separable state, then $\lambda$ will converge almost certainly to zero (see Figure 2).

When $\rho_0$ is on the boundary, $\lambda$ neither grows with $N$ nor converges to zero, but continues to fluctuate as $N \to \infty$. Its distribution is controlled by the shape and radius of $\mathcal{S}$, e.g.:

1. If $\mathcal{S}$ is small w/r.t. $\delta$, it behaves like a point (see Figure 1b). Then $d(\hat\rho_{\rm MLE}, \mathcal{S}) \approx d(\hat\rho_{\rm MLE}, \rho_0)$, $\lambda = -2 \log\left(\mathcal{L}(\rho_0)/\mathcal{L}_{\max}\right) = d(\rho_0, \hat\rho_{\rm MLE})^2$, and so $\lambda$ is a $\chi^2$ random variable with $\dim(\mathcal{M})$ degrees of freedom (a.k.a. a $\chi^2_{\dim(\mathcal{M})}$ variable).

2. If $\mathcal{S}$ is much larger than $\delta$, then it behaves like a half-space (see Figure 1d and note [16]). If $\mathcal{S}$ were a $k$-dimensional hyperplane, $\lambda$ would be a $\chi^2_{\dim(\mathcal{M})-k}$ variable. A halfspace behaves like a hyperplane of dimension $(\dim(\mathcal{M}) - 1)$, except with probability $\frac{1}{2}$, $\hat\rho_{\rm MLE}$ is separable. Thus, $\lambda$ is what we will call a semi-$\chi^2_1$ variable: it equals zero with probability $\frac{1}{2}$, and is $\chi^2_1$-distributed otherwise.

As $N \to \infty$, case (2) applies. For small $N$, however, the real situation is somewhere in between (see Figure 1c). $\mathcal{S}$ may be small, and its boundary may be sharply curved, increasing $\lambda$. In the absence of a detailed understanding of $\mathcal{S}$'s shape, case (1) provides the best rigorous upper bound on $\lambda$. Its cumulative distribution is upper bounded by that of a $\chi^2_{\dim(\mathcal{M})}$ variable, i.e., $\Pr(\lambda > x)$ is no greater than it would be if $\lambda$ were a $\chi^2_{\dim(\mathcal{M})}$ variable. As $N \to \infty$, the more optimistic semi-$\chi^2_1$ ansatz is valid, but only if we know that $N$ is "large enough".
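The two limiting cases can be illustrated with a short Monte Carlo sketch. It assumes the asymptotic Gaussian picture in the stretched metric, so that $\hat\rho_{\rm MLE}$ is a unit-variance Gaussian vector $z$ centred on the true state and $\lambda$ equals the squared distance from $z$ to $\mathcal{S}$. The coordinate choice, function names, and sample sizes are illustrative assumptions.

```python
import random


def simulate_point_case(dim, trials, rng):
    """Case (1): S is a single point at the true state, so lambda = |z|^2."""
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        for _ in range(trials)
    ]


def simulate_halfspace_case(dim, trials, rng):
    """Case (2): S is the half-space z_1 <= 0 and the true state sits on its
    boundary, so lambda = max(z_1, 0)^2 regardless of the other coordinates."""
    lambdas = []
    for _ in range(trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        lambdas.append(max(z[0], 0.0) ** 2)
    return lambdas


rng = random.Random(0)
point = simulate_point_case(15, 20000, rng)
half = simulate_halfspace_case(15, 20000, rng)

mean_point = sum(point) / len(point)                       # close to dim
zero_fraction = sum(1 for x in half if x == 0.0) / len(half)  # close to 1/2
```

In the point case the sample mean of $\lambda$ sits near the dimension, while in the half-space case about half of the trials give exactly zero, matching the semi-$\chi^2_1$ description.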

A $\chi^2_k$ variable has expected value $k$, and higher values are exponentially suppressed. So $\lambda \gg \dim(\mathcal{M})$ is sufficient to demonstrate entanglement at a high confidence level. This implies a tradeoff between an experiment's power (ability to identify many entangled states) and its efficiency (ability to do so rapidly). Powerful experiments have large dimension, e.g., a tomographically complete measurement can identify any entangled state, but has $\dim(\mathcal{M}) = d^2 - 1$. This comes at a price; experiments with large dimension are potentially much more prone to spurious large values of $\lambda$, so more data is required to achieve conclusive results [$\lambda \gg \dim(\mathcal{M})$]. Conversely, an entanglement witness (see below) is targeted at a particular state, but it can rapidly and conclusively demonstrate entanglement.

FIG. 2: Loglikelihood ratios ($\lambda$) behave dramatically differently for different states. 1000 independent simulated tomographically complete experiments were performed, on four different Werner states: separable, barely separable, slightly entangled, and highly entangled. $\lambda$ is shown for each trial (points), and averaged over all 1000 trials (solid lines). For small $N$ the experiment cannot reliably distinguish them. As $N$ grows, it resolves shorter distances in the state space. For entangled states, typical values of $\lambda$ increase linearly with $N$, whereas the separable state almost certainly yields $\lambda = 0$ [not visible in these plots; for $\rho_{q=0.25}$ (black), all trials with more than $N \approx 10^{?}$ measurements yielded $\lambda = 0$, and the average (dashed line) plunges off the graph]. For barely separable states, $\lambda$ behaves as a semi-$\chi^2_k$ variable with $k = 1$ as $N \to \infty$ (see Fig. 3).

FIG. 3: Distribution of $\lambda$ for a SIC-POVM experiment. We show the empirical complementary cumulative distribution function of $\lambda$, $\mathrm{CCDF}(\lambda_c) = \Pr(\lambda > \lambda_c)$, for the state $\rho_{q=1/3}$ and simulated datasets of size $N = \{10 \ldots 10^{?}\}$. The CCDF is used to compute confidence levels, e.g., to report entanglement at the 95% confidence level, it is necessary to observe $\lambda$ such that $\mathrm{CCDF}(\lambda) < 0.05$. For this particular state, the chance of a zero $\lambda$ approaches 50% as $N$ increases. For each $N$, $\mathrm{CCDF}(\lambda_c)$ was based on roughly $10^{?}$ data points from independent trials, each of which generated a value of $\lambda$ from $N$ tomographically complete measurements on $\rho_{q=1/3}$. We also show CCDFs for a semi-$\chi^2_1$ variable and a $\chi^2_{\dim(\mathcal{M})} = \chi^2_{15}$ variable. The semi-$\chi^2_1$ ansatz is good for large $N$, but unreliable for small $N$ (yielding too many false positives), while the $\chi^2_{15}$ ansatz is very conservative.
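To make the power versus efficiency tradeoff concrete, the following sketch evaluates the two reference CCDFs in closed form and finds the value of $\lambda$ needed for 95% confidence under each. It relies only on the standard recurrence for the regularized incomplete gamma function; the function names and the bisection helper are assumptions made for this illustration.

```python
import math


def chi2_sf(x, k):
    """Pr(X > x) for a chi-squared variable with k degrees of freedom.

    Uses Q(a+1, y) = Q(a, y) + y^a e^{-y} / Gamma(a+1) with y = x/2,
    starting from Q(1, y) = e^{-y} (even k) or Q(1/2, y) = erfc(sqrt(y)) (odd k).
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if k % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < k / 2.0 - 1e-9:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def semi_chi2_1_sf(x):
    """Pr(lambda > x) when lambda is 0 with probability 1/2, chi2_1 otherwise."""
    if x < 0:
        return 1.0
    return 0.5 * chi2_sf(x, 1)


def critical_value(sf, alpha, hi=200.0):
    """Smallest x (by bisection) at which the survival function drops to alpha."""
    lo = 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if sf(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


lam_semi = critical_value(semi_chi2_1_sf, 0.05)
lam_chi15 = critical_value(lambda x: chi2_sf(x, 15), 0.05)
```

Under these assumptions the threshold for the conservative $\chi^2_{15}$ bound is roughly nine times larger than for the semi-$\chi^2_1$ ansatz, which is the cost of the higher-dimensional measurement.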

Implementation: Computing $\lambda$ involves maximizing $\mathcal{L}(\rho)$ over two convex sets (the set of all states, and the set $\mathcal{S}$ of separable states). $\mathcal{L}(\rho)$ is log-concave, so in principle this is a convex program.

Testing separability is NP-hard, so efficient optimization over $\rho \in \mathcal{S}$ is not possible in general. But for two qubits, the positive partial transpose (PPT) criterion perfectly characterizes entanglement, and $\lambda$ can be calculated easily (see examples below). For larger systems, $\mathcal{S}$ can be bounded by simpler convex sets, as $\mathcal{S}_- \subset \mathcal{S} \subset \mathcal{S}_+$ (e.g., $\mathcal{S}_+$ = PPT states, and $\mathcal{S}_-$ = convex combinations of specific product states). Maximizing $\mathcal{L}(\rho)$ over $\mathcal{S}_+$ and $\mathcal{S}_-$ yields bounds on $\max_{\rho \in \mathcal{S}} \mathcal{L}(\rho)$, which may (depending on how wisely the bounding sets were chosen) be tight enough to confirm or deny entanglement.
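The PPT criterion for two qubits is easy to check numerically. The sketch below assumes NumPy is available and uses the Werner family from the examples as a test case; it only tests whether a given state is PPT and does not perform the constrained likelihood maximization. The function names and tolerance are illustrative assumptions.

```python
import numpy as np


def werner_state(q):
    """rho_q = q |singlet><singlet| + (1 - q) I/4 in the basis 00, 01, 10, 11."""
    psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2.0)
    return q * np.outer(psi, psi) + (1.0 - q) * np.eye(4) / 4.0


def partial_transpose(rho):
    """Transpose the second qubit: swap its row and column indices."""
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)


def min_pt_eigenvalue(rho):
    """Smallest eigenvalue of the partially transposed state."""
    return float(np.linalg.eigvalsh(partial_transpose(rho)).min())


def is_ppt(rho, tol=1e-12):
    """True if the partial transpose has no eigenvalue below -tol."""
    return min_pt_eigenvalue(rho) >= -tol
```

For Werner states the smallest eigenvalue of the partial transpose works out to $(1 - 3q)/4$, which changes sign at $q = 1/3$.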
Examples: To demonstrate the likelihood ratio test, we simulate two different experiments on two qubits. We imagine an experimentalist trying to produce the singlet state $|\Psi\rangle$, and producing instead a Werner state [8],

$$\rho_q = q\,\Pi_{\rm singlet} + (1 - q)\,\mathbb{1}/4, \qquad (6)$$

where $\Pi_{\rm singlet} = |\Psi\rangle\langle\Psi|$. Werner states are separable when $q \leq 1/3$, and entangled otherwise. The experimentalist's repeated preparations are assumed to be independently and identically distributed (i.i.d.) [9].

Witness data: The simplest way to test for entanglement is to repeatedly measure a single entanglement witness [2, 4]. An optimal witness for Werner states is $W = \mathbb{1}/2 - \Pi_{\rm singlet}$. Measuring $W$ yields one of two outcomes, "yes" or "no", corresponding to POVM elements $\{\Pi_{\rm singlet}, \mathbb{1} - \Pi_{\rm singlet}\}$. The probability of a "yes" outcome is given by Born's rule as $p = \mathrm{Tr}\,\rho\,\Pi_{\rm singlet}$, so $p$ completely characterizes a state $\rho$ for the purposes of this experiment. The data from $N$ measurements are fully characterized by the frequency of "yes" results, $f = n_{\rm yes}/N$. As $N \to \infty$, $f > \frac{1}{2}$ represents definitive proof that $\langle W \rangle < 0$, and therefore that $\rho$ is entangled. For finite $N$, $f \leq \frac{1}{2}$ means that a separable state fits as well as any other, so there is no case for entanglement. When $f > \frac{1}{2}$, our likelihood ratio quantifies the weight of the evidence for entanglement. The likelihood function depends only on $p$, as

$$\mathcal{L}(p) = \Pr(f|p) = p^{Nf}(1 - p)^{N(1-f)} = e^{-N\left[-f \log p - (1-f)\log(1-p)\right]}, \qquad (7)$$


making this a single-parameter problem. The maximum likelihood, attained at $p = f$, is $\mathcal{L}_{\max} = e^{-NH(f)}$, expressed in terms of the data's empirical entropy,

$$H(f) = -f \log f - (1 - f)\log(1 - f). \qquad (8)$$

If $f > \frac{1}{2}$, the most likely separable state has $p = \frac{1}{2}$, so that $\mathcal{L}_{\rm sep} = 2^{-N}$, which yields

$$\lambda = -2\log\frac{\mathcal{L}_{\rm sep}}{\mathcal{L}_{\max}} = 2N\left[\log(2) - H(f)\right]. \qquad (9)$$

Our numerical explorations (not shown here) confirm that for a barely-separable Werner state, $\lambda$ behaves as a semi-$\chi^2_1$ variable, even for $N$ as low as 20.
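The witness analysis is simple enough to sketch end to end. The code below evaluates Eq. (9) from raw counts, converts $\lambda$ into a tail probability under the semi-$\chi^2_1$ ansatz, and simulates a boundary state with $p = 1/2$. The function names, the random seed, and the number of trials are assumptions for illustration; this is not the simulation code used for the results quoted above.

```python
import math
import random


def _xlogx(x):
    """x log x with the convention 0 log 0 = 0."""
    return 0.0 if x == 0.0 else x * math.log(x)


def empirical_entropy(f):
    """H(f) = -f log f - (1 - f) log(1 - f), natural logarithm."""
    return -_xlogx(f) - _xlogx(1.0 - f)


def witness_lambda(n_yes, n_total):
    """lambda = 2N[log 2 - H(f)] if f > 1/2, and 0 otherwise."""
    f = n_yes / n_total
    if f <= 0.5:
        return 0.0
    return 2.0 * n_total * (math.log(2.0) - empirical_entropy(f))


def semi_chi2_1_tail(lam):
    """Pr(lambda' >= lam) for a semi-chi2_1 variable (1 when lam <= 0)."""
    if lam <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(lam / 2.0))


def simulate_boundary(n_total, trials, rng):
    """Values of lambda from repeated experiments on a state with p = 1/2."""
    lambdas = []
    for _ in range(trials):
        n_yes = sum(1 for _ in range(n_total) if rng.random() < 0.5)
        lambdas.append(witness_lambda(n_yes, n_total))
    return lambdas


rng = random.Random(1)
boundary = simulate_boundary(20, 20000, rng)
zero_fraction = sum(1 for x in boundary if x == 0.0) / len(boundary)

lam_example = witness_lambda(60, 100)      # hypothetical data: 60 "yes" in 100
tail_example = semi_chi2_1_tail(lam_example)
```

Because $f$ is discrete, the fraction of zero outcomes at $N = 20$ is somewhat above one half (ties at $f = 1/2$ count as zero), so the agreement with the semi-$\chi^2_1$ form is approximate at this sample size.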

Tomographically complete data: Many entanglement-verification experiments measure a tomographically complete set of observables on a finite-dimensional system (with a heroic example being tomography on 8 ions in an ion trap [11]). Such data identify $\rho$ uniquely as $N \to \infty$, so one can determine with certainty whether $\rho$ is entangled (modulo the computational difficulties in determining whether a specified $\rho$ is separable). Analyzing finite data is more complicated than in the witness example, for the data constrain a multidimensional parameter space. Ad-hoc techniques are unreliable, and the likelihood ratio test comes into its own.

We consider an apparatus that applies a SIC (symmetric informationally complete)-POVM [10] to each of our two qubits, independently. This measurement (not to be confused with a 4-dimensional SIC-POVM) is tomographically complete, has $4 \times 4 = 16$ outcomes, and yields 15 independent frequencies. Unlike $W$, it has no special relationship to Werner states, so any entangled $\rho$ will yield overwhelmingly convincing data as $N \to \infty$.

We repeatedly simulated $N = 10 \ldots 10^{?}$ measurements on a barely-separable Werner state ($q = 1/3$), and compared the empirical distribution of $\lambda$ to those of semi-$\chi^2_1$ and $\chi^2_{15}$ random variables (see Figure 3). As $N$ gets large, $\lambda$ becomes indistinguishable from a semi-$\chi^2_1$ variable. For smaller $N$, this ansatz is too optimistic (and would produce excessive false positives), but the $\chi^2_{15}$ ansatz is wildly overcautious. We found that for small $N$, $\lambda$ behaves like a semi-$\chi^2_D$ variable, with $D$ a bit larger than 1 (e.g. $D \approx 1.6$ for $N = 100$).
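The data-generation step of such a simulation can be sketched as follows, assuming NumPy and assuming the standard tetrahedral construction for the single-qubit SIC-POVM (the text does not specify which SIC was used). The sketch produces outcome probabilities and simulated counts only; computing $\lambda$ would additionally require maximizing the likelihood over all states and over PPT states, which is not shown. All names here are illustrative.

```python
import numpy as np

PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Bloch vectors of a regular tetrahedron (assumed SIC construction).
TETRAHEDRON = np.array(
    [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
) / np.sqrt(3.0)


def qubit_sic():
    """Four elements E_i = (I + s_i . sigma) / 4, which sum to the identity."""
    return [
        (np.eye(2) + sum(s[k] * PAULIS[k] for k in range(3))) / 4.0
        for s in TETRAHEDRON
    ]


def product_sic():
    """The 16 product elements E_i (x) E_j acting on two qubits."""
    single = qubit_sic()
    return [np.kron(a, b) for a in single for b in single]


def werner_state(q):
    psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2.0)
    return q * np.outer(psi, psi.conj()) + (1 - q) * np.eye(4) / 4.0


def outcome_probabilities(rho):
    """Born-rule probabilities for the 16 outcomes, renormalized for safety."""
    p = np.array([np.trace(e @ rho).real for e in product_sic()])
    return p / p.sum()


rng = np.random.default_rng(0)
probs = outcome_probabilities(werner_state(1 / 3))
counts = rng.multinomial(100, probs)     # one simulated dataset with N = 100
freqs = counts / counts.sum()
```

For a Werner state these probabilities take only two distinct values, $(1 - q)/16$ when both qubits give the same outcome label and $(1 + q/3)/16$ otherwise.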

Conclusions: Entanglement verification is easy when $N \to \infty$. In practice, $N$ is finite and data are never conclusive. Likelihood ratios provide a simple, reliable test of significance that can be applied to any experimental data. Large values of $\lambda$ are very unlikely to be generated by any separable state, but the hardest separable states to rule out are on the boundary. For such states, theory predicts (and our numerics confirm) that $\lambda$ behaves like a semi-$\chi^2$ random variable. If the underlying state is separable, $\Pr(\lambda > x)$ can be upper bounded using a $\chi^2_{\dim(\mathcal{M})}$ distribution, scaling as $e^{-x/2}$ for large $x$. For entangled states, $\lambda$ grows linearly with $N$, and will thus rapidly become distinguishable from any separable state.